Ge Zhang

Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities

Jan 29, 2026
Via arXiv

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Jan 29, 2026
Via arXiv

TabularMath: Evaluating Computational Extrapolation in Tabular Learning via Program-Verified Synthesis

Jan 25, 2026
Via arXiv

FutureX-Pro: Extending Future Prediction to High-Value Vertical Domains

Jan 18, 2026
Via arXiv

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Jan 13, 2026
Via arXiv

Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements

Dec 31, 2025
Via arXiv

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Dec 31, 2025
Via arXiv

Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning

Dec 28, 2025
Via arXiv

AInsteinBench: Benchmarking Coding Agents on Scientific Repositories

Dec 24, 2025
Via arXiv

CodeSimpleQA: Scaling Factuality in Code Large Language Models

Dec 22, 2025
Via arXiv